1. BACKGROUND


A new stratification of the U.S. Great Plains (USGP) was developed for use in 
the Transition Year (TY) sample design. This stratification, based on soil,
climate, and agricultural characteristics, was considered more efficient than 
stratification based entirely on political subdivisions. 

Soil characteristics were obtained from soil maps (refs. 1 and 2), and 
monthly average temperature and precipitation data obtained from the World 
Meteorological Organization were used to achieve the climatological classifi- 
cation of the area. The USGP was stratified into 27 agrophysical units (APU's) 
as shown in figure 1. Agriculture and nonagriculture areas for each APU were
delineated, using full-frame color infrared images. Segments containing 
5 percent or less agricultural area were defined as nonagriculture areas and 
were excluded from the sampling frame. 

As the APU's are generally larger than Crop Reporting Districts, the new strata
can be expected to be much less homogeneous than the counties which formed
the basis of optimum sample allocation during Large Area Crop Inventory
Experiment (LACIE) Phases I, II, and III. The questions of the extent to which
strata homogeneity has been reduced and what benefits are derived from the
new stratification approach thus arise. Besides leading to a natural
stratification, the new approach is uniformly applicable in all countries and
may provide a solution to the problem of optimum sample allocation in countries
with no historical data at a lower political subdivision level.

The stratification was made more efficient for sampling by considering the
new set of strata obtained by the intersection of APU's with political
subdivisions in the country. As the state represents the size of a political
subdivision for which historical crop information is likely to be available
in a foreign country, the state was the political subdivision level
considered for intersection with APU's in the USGP. The strata obtained by this
intersection were called refined strata and were assumed to be as homogeneous
as the APU's containing them.

Figure 1.- APU stratification of USGP.

Sample allocation in the USGP was made first at the APU level, using Neyman's
optimum allocation procedure (ref. 3), and then for the refined strata within
an APU, using proportional allocation based on the size of the agricultural
area. The APU agriculture density with respect to the sampling unit (a 5- by
6-nautical-mile-area segment) was used to estimate the within-APU variances
for wheat or small grains and the historical wheat acreages for the APU's;
both types of information were required to perform the sample allocation for
the APU's. The historical wheat acreages for the APU's were obtained by
aggregating such acreages for the refined strata, which were estimated by
apportioning the state historical wheat acreage to its refined strata on the
basis of agricultural size. The sample allocation was made to achieve a
specified precision for the wheat production estimate with cost minimized.
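
The two-stage allocation described above can be illustrated with a minimal sketch. It uses the textbook form of Neyman allocation (sample size proportional to stratum size times stratum standard deviation) followed by proportional allocation on agricultural area. This is an assumption made for illustration: the actual TY formula (ref. 5) also involves yields and their prediction errors, which are not reproduced here, and the function names and inputs are hypothetical.

```python
from math import sqrt


def neyman_allocation(total_n, sizes, variances):
    """Textbook Neyman allocation: n_h proportional to N_h * S_h.

    sizes and variances are dicts keyed by stratum (e.g., APU) label.
    Returns unrounded sample sizes; rounding is left to the caller.
    """
    weights = {h: sizes[h] * sqrt(variances[h]) for h in sizes}
    total_weight = sum(weights.values())
    return {h: total_n * w / total_weight for h, w in weights.items()}


def proportional_allocation(n_apu, ag_areas):
    """Split an APU's sample among its refined strata by agricultural area."""
    total_area = sum(ag_areas.values())
    return {r: n_apu * a / total_area for r, a in ag_areas.items()}


# Hypothetical illustration: two APU's, then two refined strata in one APU.
apu_n = neyman_allocation(100, {"A": 100, "B": 300}, {"A": 9.0, "B": 1.0})
refined_n = proportional_allocation(apu_n["A"], {"r1": 1.0, "r2": 3.0})
```

In practice the unrounded values would be rounded to whole segments, which is why allocated totals can differ slightly from the target.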

The procedure required input for APU yields and their likely prediction errors 
to determine the total sample size and its distribution for the APU's. The
yield information was assessed in terms of potential yield and a somewhat 
ad hoc procedure based on soil suitability for wheat and climate was used to 
generate the data needed (ref. 4). Further details on the stratification, 
sample allocation, and acreage estimation procedures are available in 
reference 5. 

An evaluation of the homogeneity of certain APU's in the USGP is reported in 
reference 6. This evaluation was made using the historical county data; it 
was concluded that APU's were generally not homogeneous with respect to 
wheat density. Apportionment was evaluated in this report and it was observed 
that although the apportioned estimate of refined strata historical wheat is 
not reliable, it has little effect by itself on the accuracy of the wheat 
acreage and production estimates. This conclusion and others stated in ref- 
erence 6 reflect negatively on the new stratification as well as on the sample 
design, but as the evaluations conducted and discussed in this reference 
corresponded to only a part of the USGP, they cannot be regarded as conclusive 
for the entire USGP.

This memorandum reports an evaluation of the TY-sample design as developed
for the entire USGP. This evaluation was carried out using the LACIE
Phase III segment estimates, blind site data, and historical information.


2. EVALUATION STUDIES 


Agriculture density played a major role in the development of the TY-sample
design. It was assumed that wheat acreage was uniformly distributed over the
agricultural area in an APU and in a state. Accordingly, if an APU was
agriculturally homogeneous, it was considered homogeneous with respect to
wheat. Also, the historical wheat acreages for refined strata in a state
could be determined from the state historical wheat by apportioning the
state wheat figure by the ratio of agricultural areas of the refined strata
to that of the state. It is therefore important to evaluate both the
stratification and the sample allocation for the APU homogeneity and efficiency
in sampling for wheat acreage estimation in the USGP.
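
The apportionment rule stated above (state wheat acreage split among refined strata in proportion to agricultural area) can be sketched as follows. The function name, stratum labels, and figures are hypothetical and serve only to illustrate the arithmetic.

```python
def apportion_state_wheat(state_wheat_acres, stratum_ag_areas):
    """Apportion a state's historical wheat acreage to its refined strata.

    Each refined stratum receives the state figure multiplied by the ratio
    of its agricultural area to the total agricultural area of the state.
    """
    state_ag_area = sum(stratum_ag_areas.values())
    return {
        stratum: state_wheat_acres * area / state_ag_area
        for stratum, area in stratum_ag_areas.items()
    }


# Hypothetical state with two refined strata.
shares = apportion_state_wheat(1000.0, {"stratum_1": 1.0, "stratum_2": 3.0})
```

The rule is exact only if wheat is uniformly distributed over the agricultural area, which is the assumption this report sets out to evaluate.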

In this report, APU homogeneity is evaluated by assessing (1) Are the within- 
refined-strata variances for each APU the same? and, if so, (2) Are the 
refined strata means equal? The wheat acreage proportion or percentage, 
rather than wheat acreage in a segment, is considered as a variable in this 
discussion. The Bartlett test of homogeneity (ref. 7) is used to answer the 
first question and Fisher's F-test (ref. 7) is used to answer the second 
question, regarding each APU containing two or more refined strata. 

The χ²-approximation is considered for the distribution of the Bartlett test
statistic (ref. 7). The test is first made for the homogeneity of strata
variances; if homogeneity is not confirmed, no further test is performed and
the APU is regarded as heterogeneous. On the other hand, if there is no
indication of heterogeneity, the F-test is conducted to assess the
significance of the difference between refined strata means. APU's showing a
significant difference between strata variances and/or means are regarded as
nonhomogeneous.
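
The two-step procedure above can be sketched in a few lines. The sketch uses the standard textbook forms of Bartlett's statistic and the one-way analysis-of-variance F-statistic; the report does not state the exact form taken from reference 7, so values computed this way need not reproduce the tabulated statistics. Critical values are supplied by the caller from χ² and F tables, and all names are hypothetical. Each group needs at least two values and a nonzero variance.

```python
from math import log


def _mean(values):
    return sum(values) / len(values)


def _variance(values):
    m = _mean(values)
    return sum((v - m) ** 2 for v in values) / (len(values) - 1)


def bartlett_statistic(groups):
    """Bartlett's statistic, approximately chi-square with k - 1 d.f."""
    k = len(groups)
    dfs = [len(g) - 1 for g in groups]
    variances = [_variance(g) for g in groups]
    df_total = sum(dfs)
    pooled = sum(d * v for d, v in zip(dfs, variances)) / df_total
    numerator = df_total * log(pooled) - sum(
        d * log(v) for d, v in zip(dfs, variances)
    )
    correction = 1 + (sum(1 / d for d in dfs) - 1 / df_total) / (3 * (k - 1))
    return numerator / correction


def f_statistic(groups):
    """One-way ANOVA F-statistic with (k - 1, N - k) d.f."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    ss_between = sum(len(g) * (_mean(g) - grand_mean) ** 2 for g in groups)
    ss_within = sum((len(g) - 1) * _variance(g) for g in groups)
    return (ss_between / (k - 1)) / (ss_within / (n_total - k))


def assess_apu(groups, chi2_critical, f_critical):
    """Variances are tested first; means are tested only if variances pass."""
    if bartlett_statistic(groups) > chi2_critical:
        return "heterogeneous (variances)"
    if f_statistic(groups) > f_critical:
        return "heterogeneous (means)"
    return "no evidence of heterogeneity"
```

For two refined strata with three segments each, the 5-percent critical values are 3.84 (χ², 1 d.f.) and 7.71 (F with 1 and 4 d.f.).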

The TY-sample allocation was based upon several assumptions and for it to be
considered optimum, these assumptions must be satisfied. In addition, input
data in the allocation formula can make a significant difference if such data
contain many inaccuracies and errors. This could easily happen for the
TY-sample allocation because of the type of procedures used in generating data
for the sampling frame, the strata variances, historical acreages, and yield
potentials. Although all these issues should be addressed, at present the
sample allocation is evaluated by considering a different (and hopefully more
reliable) set of strata variances and historical acreages. The Classification
and Mensuration Subsystem (CAMS) estimates of segment wheat proportions
obtained during LACIE Phase III provide a data set of much better quality than
those from Phase II used for TY-sample allocation; therefore, these segment
estimates form the basis of the data used for estimating strata variances
and evaluating the sample allocation. Considering the LACIE Phase III segments
to be randomly distributed, a poststratification of the segment estimates is
considered for this evaluation. Next, a new historical data set is prepared
for the APU's by aggregating county historical wheat acreage data. The relative
change in sample allocation caused by the use of aggregated county historical
data versus the apportioned historical data for the APU's is assessed.
There are several components to the evaluation issue being considered. These
sub-issues were addressed as they arose during the evaluation work, and are
discussed in the following sections.


3. DATA USED IN EVALUATION


For the 27 APU's across nine states, table I shows the primary data set used 
in the evaluation studies. The yield potential data (APU mean yields and 
variances), total area, and total number of segments are those used in the 
TY-sample allocation. Phase III CAMS estimates of segment wheat proportions
are used to estimate the wheat proportion means and variances for the APU's. 

A total of 446 segment estimates were used for the USGP wheat acreage
estimation during LACIE Phase III. These segments (i.e., segments for which
CAMS estimates are available from Phase III) were poststratified and the first
column under Phase III CAMS estimates (table I) gives the distribution of the
segments for the APU's. No segment estimate was available for APU 5, and for
APU's 103 and 2 only one segment estimate each was available. For these three
APU's, variances could not be estimated directly; instead, the variances
originally used in TY-sample allocation were substituted for these APU's in
table I.
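
The poststratification step described above, including the substitution of the original TY variances where fewer than two segment estimates are available, can be sketched as follows. The data layout, function name, and example values are hypothetical.

```python
from statistics import mean, variance


def summarize_apus(apus, segments, fallback_variances):
    """Poststratify segment wheat percentages by APU.

    segments is a list of (apu, wheat_percent) pairs. For an APU with fewer
    than two estimates the sample variance is undefined, so the variance
    originally used in the allocation is substituted.
    Returns {apu: (count, mean or None, variance)}.
    """
    grouped = {apu: [] for apu in apus}
    for apu, percent in segments:
        grouped[apu].append(percent)

    summary = {}
    for apu, values in grouped.items():
        count = len(values)
        avg = mean(values) if count else None
        var = variance(values) if count >= 2 else fallback_variances[apu]
        summary[apu] = (count, avg, var)
    return summary


# Hypothetical example: one APU with two estimates, one with one, one with none.
example = summarize_apus(
    ["A", "B", "C"],
    [("A", 10.0), ("A", 14.0), ("B", 5.0)],
    {"B": 23.0, "C": 34.6},
)
```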

Another set of data from Phase III was used in the present evaluation; ground
truth was collected for 132 LACIE segments, called blind sites,* for which
CAMS estimates were also available. However, the two blind sites from
Oklahoma (segment numbers 1244 and 1365) were excluded because of an
abnormality encountered in estimating their wheat acreages. (A large
underestimation was caused by unavailability of certain temporal acquisitions
necessary to determine adequate crop signatures.) In addition, no CAMS estimates
were available for 14 blind site segments. The distributions of 130 blind sites
for the APU's are given in table II for the winter wheat region and in table III
for the spring wheat region; blind sites from the mixed wheat region are also
included.


*Blind site data are maintained by the Accuracy Assessment Group, Earth
Observations Division, Lyndon B. Johnson Space Center, National Aeronautics
and Space Administration, and are available from Dr. Dave Pitts, Accuracy
Assessment Manager.

TABLE I.- APU MEANS AND VARIANCES OF LACIE PHASE III ESTIMATES OF SEGMENT
WHEAT PERCENTAGES, YIELDS, AND SIZE DATA

Serial                      Total no.    Phase III CAMS estimates     Yield potential[a]
no.     APU   Total area    segments     No.   Mean   Variance        Yield, bu/ac  Variance

1       101   172 064       240          2     4.0    21.1            19.5          [illegible]
2       102   331 640       535          5     11.9   67.2            25.0          [illegible]
3       103   [illegible]   207          1     1.9    23.0[b]         32.0          16.10
4       104   726 012       830          10    5.0    19.5            27.0          12.35
5       2     [illegible]   247          1     10.8   34.6[b]         19.0          6.35
6       3     499 914       558          6     15.1   109.5           10.5          5.98
7       4     [illegible]   542          7     24.6   302.3           20.4          7.40
8       5     465 309       103          0     -      [illegible][b]  19.5          6.73
9       60    2 030 030     [illegible]  [illegible]  21.1   152.4    24.5          10.40
10      61    436 269       [illegible]  2     17.5   18.7            21.0          7.05
11      7     6 936 170     659          39    41.5   204.0           26.0          11.60
12      8     [illegible]   [illegible]  12    29.7   103.1           20.0          13.10
13      9     3 300 970     [illegible]  31    26.9   222.5           25.0          10.85
14      10    2 915 632     [illegible]  20    22.4   147.2           25.5          11.23
15      11    [illegible]   721          35    19.5   [illegible]     31.5          15.73
16      12    [illegible]   [illegible]  21    23.6   232.0           34.0          17.60
17      13    [illegible]   266          9     12.5   45.4            32.0          16.10
18      14    561 259       209          11    13.2   76.6            40.0          22.10
19      15    1 112 420     992          21    6.0    86.5            36.0          19.10
20      16    721 416       596          19    4.6    9.9             27.5          12.73
21      17    640 344       322          6     8.9    54.9            20.5          13.48
22      18    [illegible]   205          3     4.9    7.6             22.5          0.98
23      19    [illegible]   1110         52    18.6   91.1            30.0          14.60
24      20    [illegible]   630          29    25.6   111.0           [illegible]   [illegible]
25      21    6 260 212     1229         60    16.1   73.0            26.0          11.60
26      22    [illegible]   [illegible]  16    7.8    19.0            24.0          10.10
27      23    [illegible]   [illegible]  7     8.5    29.8            26.5          11.90

[a]Used for the TY-sample allocation.
[b]Variance as originally used in TY-sample allocation.

TABLE II.- GROUND-TRUTH ACREAGES OF LACIE PHASE III
BLIND SITES FOR WINTER WHEAT REGION

        No. of       Actual wheat acreage, %
APU     segments     Average     Variance

101     1            6.5         -
103     1            1.6         -
104     4            5.2         4.53
2       1            20.2        -
3       1            27.6        -
4       3            8.2         9.96
5       1            21.9        -
60      6            18.0        316.13
7       15           46.0        294.32
8       2            35.9        62.72
9       12           27.8        116.62
10      13           23.4        126.50
11      11           22.8        104.38
12      6            18.6        374.38
13      2            19.4        44.18
14      3            12.4        29.14
15      5            14.7        138.34
16      4            0.2         0.10
17      3            8.6         108.76
19      4            0.5         0.48
21      6            1.5         5.59
22      2            6.4         80.64
23      3            3.9         6.88

TABLE III.- GROUND-TRUTH ACREAGES OF LACIE PHASE III
BLIND SITES FOR SPRING WHEAT REGION

        No. of        Actual wheat acreage, %
APU     segments      Average     Variance

104     3             3.7         34.09
15      2             6.2         56.18
16      4             5.8         15.75
17      2             0.2         0.12
19      19            19.1        142.55
20      [illegible]   31.7        197.73
21      22            17.6        128.03
22      2             3.7         24.50
23      3             13.4        221.22


4. NUMERICAL RESULTS


4.1 APU HOMOGENEITY EVALUATIONS 

The average wheat proportions and the variance estimates for the refined
strata are given in table IV. Computations are based on the CAMS estimates
for the number of segments available for these refined strata (second column
in table IV). For each APU having variance estimates available for two or more
refined strata, the Bartlett statistic (ref. 7) was computed to test for the
equality of refined strata variances. The computed statistics are given in
the fifth column of table IV. Considering a 5-percent significance level for
the test and a χ²-approximation for the test statistics, it was found that
APU's 15, 20, and 21 were nonhomogeneous with respect to their refined strata
variability. APU 60 was also declared nonhomogeneous when tested at the
10-percent significance level.

Another source of variation is the difference in refined strata means. To
test for the equality of refined strata means for an APU, F-statistics (ref. 7)
were computed for the APU's which were not declared heterogeneous by the test
procedure above. However, none of these APU's were found to contain refined
strata with a statistically significant difference in their means. Accordingly,
the LACIE Phase III segment estimates show evidence of nonhomogeneity for
APU's 60, 15, 20, and 21. Data evidence for nonhomogeneity is not very strong
for APU 60, the only APU from the pure winter wheat region falling in the
category of nonhomogeneous APU's. For two of its refined strata, the variance
estimates are based on two or three segments and hence are not very
reliable. Data evidence is much more reliable and stronger in the case of
APU's 15 and 21 in the mixed wheat region, and APU 20 in the pure spring
wheat region.

4.2 EVALUATIONS BASED ON PHASE III BLIND SITE DATA 

For APU's with two or more blind sites available for estimating wheat
proportions by CAMS, table V lists sample means and variances computed for the
ground-truth wheat percentages, the CAMS estimated wheat percentages, and the
difference between the two. Considering only those APU's which have three or
more blind sites with CAMS estimates, the ground-truth wheat percentages are
linearly regressed on their CAMS estimates. Coefficients for the regression
equations and the residual mean-square errors (MSE) are also listed in table V.
Except for APU's 19, 20, and 21 in the spring wheat region, and for APU's 7,
9, 10, and 11 in the winter wheat region, the reliability of the regression
equation is low.

TABLE IV.- TESTS OF HOMOGENEITY OF VARIANCES AND MEANS
FOR REFINED STRATA IN EACH APU

[a]Last two digits indicate state code (fig. 2).
[b]Significant at 10-percent level of significance.
[c]Significant at 5-percent level of significance.
[d]Significant at 1-percent level of significance.

TABLE IV.- Concluded

Refined      Number of CAMS      Average wheat   Variance    Bartlett     F-statistic
strata[a]    segment estimates   percentage      estimate    statistic

[illegible]  20                  24.6            223.5
[illegible]  1                   3.8             -
[illegible]  9                   12.5            45.4
[illegible]  0                   -               -
[illegible]  5                   13.2            120.7
1431         6                   13.3            56.7        1.32         0
1520         0                   -               -
1527         13                  1.4             6.3
1531         8                   15.7            90.6
1546         0                   -               -           32.92[d]
1631         0                   -               -
1646         19                  4.6             9.9
1731         2                   16.5            33.6
1746         4                   5.1             22.2        0.12         2.09
1846         3                   4.9             7.6
1927         12                  10.9            24.9
1938         29                  24.7            53.4
1946         11                  10.8            42.6        4.55         1.75
2027         20                  22.6            116.8
2038         9                   32.1            40.9
2046         0                   -               -           5.76[c]
2130         12                  20.7            50.1
2138         32                  17.6            59.0
2146         16                  7.4             25.8        7.09[c]
2230         14                  7.0             15.6
2238         2                   13.5            20.5
2246         0                   -               -           0.06         2.06
2330         7                   8.5             29.8

TABLE V.- APU BLIND SITE DATA ANALYSIS

Based on ground-truth variance estimates for the APU's mentioned above, no
significant difference exists between the APU variances in the spring wheat
region or between the APU variances in the winter wheat region. Although
the variance estimate of APU 7 appears fairly high compared to others in the
winter wheat region, the difference is not statistically significant. Thus,
blind site data for these APU's as well as from the remaining ones should be
pooled and combined to obtain one reliable regression equation for the winter
wheat region and one for the spring wheat region.

If y is the ground-truth wheat percent and x is its CAMS estimate for a
segment, the two regression equations obtained by the least-squares fit are

    y = 2.06 + 0.991x                                                  (1)

for the winter wheat region with 77 data points, and

    y = 2.46 + 1.09x                                                   (2)

for the spring wheat region with 51 data points. Their respective residual
MSE are 37.3 and 55.4 (see table V).

Equations (1) and (2) should be regarded as calibration equations rather than
regression equations. This distinction is necessary because the regression
model assumes that the regressor (i.e., CAMS segment estimate) is error free,
which is certainly not true.

The following conclusions are reached from the blind site data analysis given
in table V:

a. APU variances computed from CAMS estimates are consistently smaller than 
those computed from the ground-truth segment wheat acreages for the spring 
wheat. Although a similar tendency of the APU variance underestimation 
from the use of CAMS estimates appears for the winter wheat, it is not 
consistent over APU's as in the case of spring wheat. 

b. The regression of actual segment wheat percent on its CAMS estimate is 
significant. 

These results suggest that the CAMS segment estimates can be improved by the
use of calibration equations (1) and (2). Thus, besides the CAMS estimates,
which seem to underestimate the strata variances, segment wheat proportion
estimates obtained from the calibration equations are also used. It may then
be feasible to assess the impact of strata variance underestimation on
the sample allocation.

The segment wheat percent is predicted or estimated corresponding to its CAMS
estimate from the applicable calibration equation, resulting in a new set of
segment estimates, referred to as a "calibrated" data set. Another data set,
obtained by replacing the calibrated estimate for a segment by its
ground-truth wheat percent (when available), is then prepared to assess the
likely impact on sample allocation due to underestimation of strata variances
from the CAMS segment estimates. This data set will be referred to as "mixed."
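
The construction of the calibrated and mixed data sets can be sketched as below, using the coefficients of equations (1) and (2). The record layout (keys `cams`, `region`, `ground_truth`) and the function names are hypothetical and chosen only for this illustration.

```python
# Intercept and slope of calibration equations (1) and (2).
CALIBRATION = {
    "winter": (2.06, 0.991),
    "spring": (2.46, 1.09),
}


def calibrate(cams_percent, region):
    """Predict the segment wheat percent from its CAMS estimate."""
    intercept, slope = CALIBRATION[region]
    return intercept + slope * cams_percent


def build_data_sets(segments):
    """Return the (calibrated, mixed) segment wheat percentages.

    Each segment is a dict with keys 'cams' and 'region', plus an optional
    'ground_truth' value for blind sites. The mixed set uses the ground-truth
    percent where available and the calibrated value otherwise.
    """
    calibrated = [calibrate(s["cams"], s["region"]) for s in segments]
    mixed = [
        s["ground_truth"] if s.get("ground_truth") is not None else c
        for s, c in zip(segments, calibrated)
    ]
    return calibrated, mixed


# Hypothetical segments: one ordinary winter segment and one spring blind site.
calibrated, mixed = build_data_sets([
    {"cams": 10.0, "region": "winter"},
    {"cams": 20.0, "region": "spring", "ground_truth": 30.0},
])
```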

4.3 SAMPLE ALLOCATION EVALUATION 

The optimum sample allocation results obtained using the LACIE Phase III
segment data of CAMS estimates, calibrated values, and mixed figures are
given in this section. The TY-sample allocation formula described in
reference 5 is used. The optimum allocation formula is applied at the APU level
and at the refined stratum level; the latter case is to evaluate the
proportional allocation used previously during TY. As considered in TY, the
present sample allocation is determined by considering the 5-percent
coefficient of variation desired for the production estimate with a rate of
75-percent sample acquisition. The strata historical wheat acreages are
obtained from the 1974 agriculture census data in two different ways,
apportioned from the states and aggregated from county data, following the
procedures described in section 3. The apportioned historical wheat acreages
for strata are exactly those used for the TY-sample allocation.

4.3.1 ALLOCATION AT APU LEVEL 

Table VI lists the total sample size and its allocation among the 27 APU's in
the USGP for each of the cases discussed above. The original TY-sample
allocation figures are also listed. These evaluations lead to the following
conclusions:

a. The sample size determined by using the apportioned historical acreages
is on the average about 13 percent smaller than that obtained by using
the aggregated county historical acreages in each case.

b. Although the total sample size for the original allocation appears
satisfactory (487 versus 451 with the CAMS estimates, 469 with the calibrated
data, and 514 with the mixed data - an RD of less than 10 percent), both
significant underallocation and overallocation are observed for the
individual APU's. The APU's showing undersampling are 4, 60, 9, 10, 13,
17, and 20 and there is an oversampling for APU's 102, 2, 11, 14, 18,
and 22. When compared with the sample allocation using aggregated county
historical acreages, the original sample size is consistently on the low
side (487 versus 518 with the CAMS estimates, 538 with the calibrated
data, and 593 with the mixed data), and thus the underallocation for the
TY-sample design may be as high as 20 percent. In addition to the APU's
mentioned previously, two more APU's, 15 and 23, fall in the undersampling
category; but APU 14 does not show any oversampling in this case. Thus,
about 30 percent of the APU's are either undersampled or oversampled,
according to the present evaluation.

c. When the sample sizes for the three cases of CAMS, calibrated, and mixed
data are compared, the results (table VI) show that the total sample size
is approximately 3 percent higher for the calibrated case and 14 percent
higher for the mixed data case than for the CAMS estimates case, a direct
consequence of the underestimation of the refined strata variances shown
by the blind site data analysis discussed earlier. Much larger differences
are noted in the spring wheat APU's (e.g., sample size of 42 vs. 56
in APU 19, 26 vs. 35 in APU 20, 36 vs. 48 in APU 21, and 10 vs. 18 in
APU 23) because of the significant underestimation of variances of APU's
in the northern USGP.

TABLE VI.- APU SAMPLE ALLOCATION

                No. of
                original
                sample    CAMS               Calibrated         Mixed
APU             segments  A[a] C[b] RD,[c] % A    C    RD, %    A    C    RD, %

101[d]          -         3    3    0.0      3    3    0.0      1    2    -50.0
102             27        13   15   -13.3    14   16   -12.5    14   16   -12.5
103             4         4    5    -20.0    4    5    -20.0    4    5    -20.0
104             19        13   15   -13.3    13   15   -13.3    14   17   -17.6
2               9         4    4    0.0      4    4    0.0      4    4    0.0
3               18        14   16   -12.5    14   16   -12.5    15   18   -16.7
4               7         25   29   -13.8    25   29   -13.8    26   30   -13.3
5               7         6    7    -14.3    7    8    -12.5    7    8    -12.5
60              9         12   14   -14.3    12   14   -14.3    14   16   -12.5
61              3         4    4    0.0      4    4    0.0      4    4    0.0
7               37        38   43   -11.6    38   44   -13.6    39   45   -13.3
8               7         7    8    -12.5    7    8    -12.5    7    8    -12.5
9               21        31   36   -13.9    31   36   -13.9    32   36   -11.1
10              27        31   36   -13.9    32   37   -13.5    34   39   -12.8
11              35        27   31   -12.9    27   31   -12.9    29   34   -17.6
12              21        20   23   -13.0    20   23   -13.0    24   27   -11.1
13              11        7    9    -22.2    8    9    -11.1    6    7    -14.3
14              17        13   15   -13.3    13   15   -13.3    14   16   -12.5
15              40        43   50   -14.0    43   49   -12.2    42   48   -12.5
16              13        7    8    -12.5    8    9    -11.1    9    11   -18.2
17              7         9    10   -10.0    9    10   -10.0    11   13   -15.4
18              1         2    2    0.0      2    2    0.0      2    2    0.0
19              50        42   48   -12.5    46   54   -14.0    56   64   -12.5
20              25        26   30   -13.3    29   33   -12.1    35   40   -12.5
21              50        36   42   -14.6    40   46   -13.0    48   56   -14.3
22              11        4    4    0.0      4    5    -20.0    5    6    -16.7
23              11        10   12   -16.7    12   13   -7.7     18   21   -14.3
Total           487       451  518  -13.2    469  538  -12.0    514  593  -13.3

[a]A - Sample allocation for the case of apportioned historical wheat acreages.
[b]C - Sample allocation for the case of aggregated county historical wheat acreages.
[c]RD - Relative difference, (A - C)/C.
[d]Not included in the original allocation, and only the refined stratum in Colorado
is considered for the other three cases.
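
The relative difference (RD) between the sample size for apportioned historical acreages (A) and that for aggregated county historical acreages (C), defined in the table footnote as (A - C)/C and reported in percent, can be computed with a short sketch. The function name is hypothetical; the example values are the pairs 13 and 15, 514 and 593, and 1 and 2.

```python
def relative_difference(a, c):
    """Relative difference of allocation A with respect to allocation C, in percent.

    Undefined when c is zero, in which case Python raises ZeroDivisionError.
    """
    return 100.0 * (a - c) / c


rd_small = round(relative_difference(13, 15), 1)
rd_total = round(relative_difference(514, 593), 1)
rd_half = round(relative_difference(1, 2), 1)
```

A negative RD means the apportioned historical acreages lead to a smaller sample than the aggregated county acreages.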

The present sample allocations show that APU's 101, 103, 2, 61, 18, and 22
have been allocated five or fewer sample segments and thus at most three to
four segments from an APU may be expected for data availability. The
reliability of acreage estimates for these APU's will therefore be poor. One
possible way to improve the reliability is to merge these marginal wheat-growing
APU's into other contiguous yet similar APU's. Assessing the similarity in
terms of APU wheat acreage variances and their potential yield (table I),
these APU's were merged or combined with others as follows: {2, 3, 5},
{4, 61}, {10, 101}, {11, 103}, and {18, 22}.

For APU 101, only its refined stratum in Colorado is merged with APU 10. The
new stratification thus obtained for the USGP will be referred to as "merged
APU's."

The sample allocation for each of the three data input cases discussed
previously was performed. The results for the sample size are listed in
table VII. Once again, the new sample size figures, and hence evaluations,
parallel those reached for the original APU stratification; for example,

a. There is no significant difference for the total sample size between the
original sample allocation and the present allocation based on apportioned
historical data, but about 50 percent of the APU's show either
underallocation or overallocation.

b. The relative difference of the sample size obtained for the case of
apportionment to that in the case of aggregated county historical data
is about -14 percent.

c. Approximately 14 percent more samples are needed for the mixed data case
than for the CAMS estimates case.

TABLE VII.- MERGED APU SAMPLE ALLOCATION[a]

                No. of
                original
                sample    CAMS               Calibrated         Mixed
APU             segments  A    C    RD, %    A    C    RD, %    A    C    RD, %

102             27        14   16   -12.5    14   16   -12.5    15   17   -11.8
104             19        13   15   -13.3    14   16   -12.5    15   17   -11.8
{2, 3, 5}       34        22   26   -15.4    22   26   -15.4    22   26   -15.4
{4, 61}         10        32   37   -13.5    32   37   -13.5    33   39   -15.4
60              9         12   14   -14.3    12   15   -20.0    14   16   -12.5
7               37        39   45   -13.3    39   45   -13.3    40   47   -14.9
8               7         7    8    -12.5    7    9    -22.2    7    9    -22.2
9               21        32   37   -13.5    32   37   -13.5    32   37   -13.5
{10, 101}       27        42   50   -16.0    42   50   -16.0    45   54   -16.7
{11, 103}       39        37   42   -11.9    37   43   -14.0    41   47   -12.8
12              21        21   24   -12.5    21   24   -12.5    24   28   -14.3
13              11        8    9    -11.1    8    9    -11.1    6    7    -14.3
14              17        13   16   -18.8    14   16   -12.5    14   16   -12.5
15              40        44   51   -13.7    44   51   -13.7    43   50   -14.0
16              13        7    8    -12.5    8    9    -11.1    10   11   -9.1
17              7         9    10   -10.0    9    11   -18.2    11   13   -15.4
19              50        43   49   -12.2    48   55   -12.7    57   66   -13.6
20              25        27   31   -12.9    30   34   -11.8    36   41   -12.2
21              50        36   42   -14.3    41   47   -12.8    50   57   -12.3
{18, 22}        12        6    7    -14.3    7    8    -12.5    8    9    -11.1
23              11        11   12   -8.3     12   14   -14.3    19   22   -13.6
Total           487       475  549  -13.5    493  572  -13.8    542  629  -13.8

[a]Merging of APU's is primarily based upon statistical and contiguous considerations.

In addition to the suggested merging of some APU's, it is also proposed to
divide the APU's that are assessed heterogeneous by Bartlett's test
(section 4.1). Considering the strata variance homogeneity and potential yield
as the decision criterion, the following combinations of refined strata
within APU's are obtained as new APU's: {1527, 1546}, {1531, 1520}, {1927},
{1938, 1946}, {2038, 2046}, {2027}, {2130, 2138}, {2146}. (See figs. 1 and 2
for APU and state codes.) Although desirable to split APU 60, it was kept
intact to avoid having strata too small. This partition will be referred to as
"split and merged" APU stratification. Figure 3 shows the newly created
APU's.

The sample allocation results (table VIII) show that the original total sample
size is quite adequate unless it is compared with the sample size for the
mixed data case with aggregated county historical acreages (487 vs. 584).
However, there are consistently significant underallocations and
overallocations for some APU's, as follows:

Category            Split and merged APU's

Overallocation      102, 104, {2, 3, 5}, 13, 14, {1527, 1546}, 16, 1927,
                    {1938, 1946}, {2038, 2046}, {2130, 2138}, {18, 22}

Underallocation     {4, 61}, {60}, 9, {10, 101}, {1531, 1520}, 2027

TABLE VIII.- SPLIT AND MERGED APU SAMPLE ALLOCATION

                No. of
                original
                sample    CAMS               Calibrated         Mixed
APU             segments  A    C    RD, %    A    C    RD, %    A    C    RD, %

102             27        13   15   -13.3    13   15   -13.3    14   16   -12.5
104             19        12   14   -14.3    13   15   -13.3    14   17   -17.6
{2, 3, 5}       34        21   24   -12.5    21   24   -12.5    21   25   -16.0
{4, 61}         10        30   35   -14.3    30   35   -14.3    32   37   -13.5
60              9         12   14   -14.3    12   14   -14.3    14   16   -12.5
7               37        36   42   -14.3    37   42   -11.9    39   45   -13.3
8               7         7    8    -12.5    7    8    -12.5    7    8    -12.5
9               21        30   35   -14.3    30   35   -14.3    31   36   -13.9
{10, 101}       27        39   47   -17.0    40   47   -14.9    43   52   -17.3
{11, 103}       39        35   40   -12.5    35   40   -12.5    39   45   -13.3
12              21        19   22   -13.6    20   23   -13.0    23   27   -14.8
13              11        7    8    -12.5    7    8    -12.5    6    7    -14.3
14              17        13   15   -13.3    13   15   -13.3    13   15   -13.3
{1531, 1520}    23        27   31   -12.9    27   31   -12.9    31   36   -13.9
{1527, 1546}    17        4    5    -20.0    5    5    0        5    5    0
16              13        6    7    -14.3    7    8    -12.5    9    11   -18.2
17              7         9    10   -10.0    9    10   -10.0    11   13   -15.4
1927            8         6    7    -14.3    7    8    -12.5    9    11   -13.9
{1938, 1946}    42        29   33   -12.1    32   37   -13.5    39   46   -15.2
2027            9         14   16   -12.5    16   18   -11.1    21   24   -12.5
{2038, 2046}    16        7    8    -12.5    8    9    -11.1    8    10   -20.0
{2130, 2138}    46        25   29   -13.8    28   32   -12.5    38   44   -13.6
2146            4         4    4    0        4    5    -20.0    5    6    -16.7
{18, 22}        12        6    7    -14.3    7    8    -12.5    8    9    -11.1
23              11        10   12   -16.7    11   13   -15.4    18   21   -14.3
Total           487       421  488  -13.7    439  505  -13.1    499  584  -14.6


Note that the split APU {1527, 1546} shows overallocation, whereas the other
part of APU 15, {1520, 1531}, shows underallocation. Similarly, the two
parts of the original APU 20 fall into opposite categories of allocation.
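
The classification into overallocation and underallocation can be illustrated
by comparing the original TY segment count of a stratum with its optimum
allocation. The sketch below is only an illustration: the function name
classify and the 10 percent tolerance are hypothetical choices, since the
criterion actually used to call a difference "significant" is not stated here.
The five rows are taken from table VIII (CAMS case, first column of optimum
values).

```python
# (original TY segments, optimum allocation) for a few strata from table VIII,
# CAMS data case, first column of optimum values.
ROWS = {
    "102": (27, 13),
    "9": (21, 30),
    "{1527, 1546}": (17, 4),
    "{1531, 1520}": (23, 27),
    "8": (7, 7),
}


def classify(original, optimum, tolerance=0.10):
    """Label a stratum by comparing its TY allocation with the optimum.

    `tolerance` is a hypothetical threshold, not a value from the report.
    """
    if optimum <= 0:
        raise ValueError("optimum allocation must be positive")
    ratio = (original - optimum) / optimum
    if ratio > tolerance:
        return "overallocation"
    if ratio < -tolerance:
        return "underallocation"
    return "adequate"


labels = {apu: classify(orig, opt) for apu, (orig, opt) in ROWS.items()}
for apu, label in labels.items():
    print(f"{apu:>14}: {label}")
```

With these inputs the two parts of APU 15 land in opposite categories, which
is the behavior noted above.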

The results of comparisons between different cases of data utilization are
parallel with those obtained and discussed previously for the original APU
or merged APU stratification. On the other hand, on a case-by-case basis,
the sample sizes for the original APU stratification are consistently
higher than those for the split and merged APU stratification. It may
therefore be concluded that the latter stratification is more efficient than
the original. Accordingly, had the TY-sample allocation been performed
optimally with respect to the split and merged APU stratification, the
original sample size might have been smaller than 487. Although this would
help in eliminating overallocation for some APU's, the underallocation would
become a larger problem.

Based upon physical considerations (e.g., soil and topography), it seemed
that APU homogeneity could not be extended to certain merged APU's. It was
therefore decided not to merge APU's 61 and 4, 103 and 11, and 18 and 22.
With this modification, the only cases of merged APU's remaining are
{2, 3, 5} and {10, 101}. This stratification will be referred to as "modified
merged APU's."

Sample allocation was performed for this new stratification; the results given
in table IX show that the figures lie between those obtained for the original
and the merged APU stratifications. Conclusions are again parallel with
those derived in the other two cases:

a. No significant difference in the total sample size, but sample sizes of
50 percent of the APU's are affected considerably

b. Underallocation by 13 percent with the use of apportioned historical
data in sample allocation

c. Sample size for the mixed data case higher than that for the CAMS
estimates case by 15 percent
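
The relative differences (RD) tabulated with the allocations appear to be
consistent with RD = 100 (A - C)/C, where A and C are taken here to be the
sample sizes obtained with apportioned and with aggregated county historical
acreages, respectively. This reading of the column labels is an assumption.
A minimal check against the totals of table VIII:

```python
def relative_difference(apportioned, aggregated):
    """Percent difference of the apportioned case relative to the aggregated case."""
    if aggregated == 0:
        raise ValueError("aggregated sample size must be nonzero")
    return 100.0 * (apportioned - aggregated) / aggregated


# Total sample sizes (A, C) from table VIII for two data cases.
totals = {"CAMS": (421, 488), "Mixed": (499, 584)}

rd = {case: round(relative_difference(a, c), 1) for case, (a, c) in totals.items()}
print(rd)  # {'CAMS': -13.7, 'Mixed': -14.6}
```

Both values agree with the RD entries printed in the Total row of that table.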

Next, considering the proposed split of APU's for the modified merged APU
stratification, the optimum allocation was performed (table X). The sample
sizes for individual APU's were parallel with those obtained in the preceding
two cases, and the total sample size was smaller by about 7 percent than that
obtained for the split and merged APU stratification and by about 11 to
15 percent than those in the case of merged APU stratification. Compared to
the TY sample size of 487, the sample sizes were lower except for the case of
mixed data with aggregated county historical acreages, suggesting an
overallocation during TY. Significant underallocation and overallocation were
again observed for about half of the APU's. However, this stratification
suffers from having several small APU's which are allocated only a few sample
segments each. When it becomes critical to use only a stratum's own sample
data for its acreage estimation, this stratification may not merit as much
consideration as the merged or the split and merged APU stratification.
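
The "about 7 percent" figure can be checked from the Total rows of tables VIII
and X. The sketch below uses only totals that are legible in both tables; the
dictionary keys (data case, then A for apportioned or C for aggregated county
acreages) are labels assumed here for illustration.

```python
TY_TOTAL = 487

# Total optimum sample sizes from table VIII (split and merged APU's).
SPLIT_MERGED = {
    ("CAMS", "A"): 421, ("CAMS", "C"): 488,
    ("Calibrated", "C"): 505,
    ("Mixed", "A"): 499, ("Mixed", "C"): 584,
}
# Total optimum sample sizes from table X (split and modified merged APU's).
SPLIT_MODIFIED = {
    ("CAMS", "A"): 392, ("CAMS", "C"): 455,
    ("Calibrated", "C"): 468,
    ("Mixed", "A"): 464, ("Mixed", "C"): 539,
}


def percent_reduction(reference, new):
    """Percent by which `new` is smaller than `reference`."""
    return 100.0 * (reference - new) / reference


reductions = {
    key: round(percent_reduction(SPLIT_MERGED[key], SPLIT_MODIFIED[key]), 1)
    for key in SPLIT_MERGED
}
# Cases in table X whose total exceeds the TY sample size.
above_ty = [key for key, total in SPLIT_MODIFIED.items() if total > TY_TOTAL]

print(reductions)
print(above_ty)
```

Every reduction falls between 6.8 and 7.7 percent, and only the mixed data
case with aggregated county acreages exceeds 487, consistent with the
statements above.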

The total sample sizes are plotted in figure 4 for the various data input
cases corresponding to the original, merged, and split and merged APU
stratification. As might be expected, the sample sizes for the calibrated data
case are only slightly higher than the corresponding ones for the CAMS
estimates case. However, use of the mixed data makes a significant difference
in sample sizes and shows that the sample allocation is considerably affected
by the underestimation of strata variance resulting from the CAMS segment
estimates. The sample sizes obtained using the aggregated county historical
acreages for strata are consistently higher than the corresponding ones in
the case of apportioned historical acreages for the strata.
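
Why an underestimated strata variance lowers the computed sample size can be
seen from the textbook Neyman (optimum) allocation. The allocation formula is
not restated in this section, so the sketch below is an assumption about the
general technique rather than the procedure actually used: it ignores the
finite population correction, and the strata sizes, standard deviations, and
variance target are invented purely for illustration.

```python
import math


def neyman_total_sample_size(sizes, std_devs, target_variance):
    """Total n needed so the estimated total meets a variance target.

    Under Neyman allocation (n_h proportional to N_h * S_h) and with no
    finite population correction, Var = (sum N_h S_h)^2 / n, so
    n = (sum N_h S_h)^2 / target_variance.
    """
    weighted = sum(n_h * s_h for n_h, s_h in zip(sizes, std_devs))
    return math.ceil(weighted ** 2 / target_variance)


def neyman_allocation(sizes, std_devs, total_n):
    """Split total_n across strata in proportion to N_h * S_h (unrounded)."""
    products = [n_h * s_h for n_h, s_h in zip(sizes, std_devs)]
    weighted = sum(products)
    return [total_n * p / weighted for p in products]


# Hypothetical strata: segment counts N_h and standard deviations S_h.
sizes = [1000, 400, 250]
true_sd = [12.0, 20.0, 8.0]
# Suppose the variance source understates every S_h by 10 percent.
understated_sd = [0.9 * s for s in true_sd]
target = 2.0e6

n_true = neyman_total_sample_size(sizes, true_sd, target)
n_under = neyman_total_sample_size(sizes, understated_sd, target)
print(n_true, n_under)
print(neyman_allocation(sizes, true_sd, n_true))
```

Because the required n scales with the square of the weighted standard
deviations, a 10 percent understatement of every S_h cuts the computed sample
size by roughly 19 percent in this made-up case.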

It follows from the above results that both the use of apportionment for
determining APU historical acreages and the use of CAMS segment estimates for
the APU variance estimation would lead to a smaller sample size when the
sample allocation is performed at the APU level. As both of these factors were
part of the allocation procedure for the TY-sample design, it is concluded
that there may be an underallocation as high as 20 percent for the sample
segments in the USGP during TY.

TABLE X.- SPLIT AND MODIFIED MERGED APU SAMPLE ALLOCATION

                Original         CAMS            Calibrated            Mixed
                sample
APU             segments     A    C   RD, %     A    C   RD, %     A    C   RD, %

102                   27    12   14            13   11            14   16
103                    4     4    4             4    ?             4    5
104                   19    12   14            12   14            14   16
{2, 3, 5}             34    20   23            20    ?            20   24
4                      8    23   27            24    ?            25   29
60                     9    11   13            11   13            13   15
61                     3     3    4             3    4             4    4
7                     37    35   41            35   41            37   43
8                      7     7    8             7    8             7    8
9                     21    29   33            29   34            30   35
{10, 101}             27    38   45            38   45            42   50
11                    35    25   29            25   29            28   32
12                    21    19   22            19   22            22   25
13                     ?     7    8             7    8             6    ?
14                    17    12   14            12   14            13   15
{1531, 1520}          23    26   30            26   30            30   35
{1527, 1546}          17     4    5             5    5             6    7
16                    13     6    7             7    8             9   10
17                     7     8    ?             8   10             ?   12
18                     3     ?    ?             2    2             ?    2
{1927, 1946}          14    12   14            13   15            20   24
1938                  36    14   17            16   18            16   19
2027                   9    14   16            15   17            20   23
{2038, 2046}          16     7    8             7    9             8    9
{2130, 2138}          46    24   28            27   31            36   42
2146                   4     4    4             4    5             5    6
22                     ?     4    4             4    4             5    5
23                    11    10    ?            11   12            17   20

Total                487   392  455   -13.8   404  468   -13.7   464  539   -13.9

4.3.2 ALLOCATION AT THE REFINED STRATA LEVEL 

To evaluate the proportional allocation employed at the refined strata level
for the TY-sample design, the optimum sample allocation was performed for the
refined strata using the data sets described previously. If fewer than two
CAMS estimates were available for a refined stratum, it was merged with other
refined strata in its APU and the APU variance estimate was used for each of
the merged refined strata. Again considering different types of data to
compute refined strata historical acreages and variances, the sample
allocation was evaluated in each case; results are given in table XI.
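
The fallback rule for refined strata with fewer than two CAMS estimates can be
sketched as follows. This is one possible reading of the rule, not the
original procedure: here the "APU variance estimate" is assumed to be the
sample variance of all segment estimates in the APU, and the function name
and the segment values are hypothetical.

```python
from statistics import variance


def stratum_variances(estimates_by_stratum, apu_of):
    """Return a variance for each refined stratum.

    Strata with at least two segment estimates use their own sample variance.
    The others receive the variance of their APU, computed here from all
    segment estimates in that APU (an assumed definition).
    """
    by_apu = {}
    for stratum, values in estimates_by_stratum.items():
        by_apu.setdefault(apu_of[stratum], []).extend(values)

    result = {}
    for stratum, values in estimates_by_stratum.items():
        if len(values) >= 2:
            result[stratum] = variance(values)
        else:
            pooled = by_apu[apu_of[stratum]]
            if len(pooled) < 2:
                raise ValueError(f"APU {apu_of[stratum]} has too few estimates")
            result[stratum] = variance(pooled)
    return result


# Hypothetical segment wheat proportions (percent) for refined strata of APU 15.
estimates = {"1520": [10.0, 14.0, 12.0], "1527": [30.0], "1531": [8.0, 16.0]}
apu_of = {stratum: "15" for stratum in estimates}

variances = stratum_variances(estimates, apu_of)
print(variances)
```

In this made-up example, stratum 1527 has a single estimate and therefore
takes the APU-level variance, while the other two strata keep their own.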

A comparison between the TY allocation and the optimum allocations shows that
the TY allocation has a higher sample size and hence is inefficient compared
to the optimum allocations obtained using the CAMS estimates data
(33 percent), the calibrated data (29 percent), and the mixed data
(11 percent), for the case of apportioned wheat acreages for the refined
strata. Differences in sample sizes are smaller for the county aggregated
wheat acreages. Other conclusions are similar to those made previously for
the APU-level sample allocation. Use of apportionment data leads to
underallocation by about 13 percent. The refined strata showing significant
sample overallocation and underallocation are as follows:

Type               Refined strata (a)

Overallocation     10220, 10240, 10430, 248, 340, 1120, 1320, 1420, 1520,
                   1527, 1646, 1938, 2038, 2138, and 2230

Underallocation    10108, 348, 448, 948, 1031 (b), 1131 (b), 1531, 2027

(a) The last two digits refer to a state code number (see fig. 2).
(b) Applies only to the case of aggregated county historical acreages for
refined strata.

These results for overallocation and underallocation are obtained irrespective
of the total optimum sample size. For example, although the total sample size
in the case of mixed data with aggregated county historical acreages for
refined strata exceeds the TY total sample size (this happens only in one
case), the conclusions for the individual refined strata regarding
underallocation or overallocation are the same as in the remaining cases.

Considering the refined strata by states, these results suggest that there
was overallocation in Kansas and North Dakota, and underallocation in Colorado,
Nebraska, and Texas during TY. The underallocation in Colorado is partly due
to noncoverage of APU 101 in the TY-sample allocation.

Figure 4 also shows the optimum sample sizes obtained for the refined strata
level. These sample sizes are smaller than those obtained for the
various APU stratifications. Although the implication is that the refined
strata level stratification is more efficient than any of the APU-level
stratifications, it has the drawback of allocating few or no sample segments
to some refined strata.

5. SUMMARY AND CONCLUSIONS

The natural stratification and sample allocation used for the TY-sample design
were examined. LACIE Phase III data were employed to test the APU homogeneity
and to evaluate the optimum sample allocation when performed at both the APU
level and the refined strata level. The effect of apportionment on the sample
allocation was assessed by determining the relative change in sample size
caused by use of the aggregated county historical wheat acreages in place of
apportioned historical wheat acreages for the refined strata and APU's. The
evaluations lead to the following conclusions:

a. APU's 15, 19, 20, and 21 are heterogeneous for wheat density and therefore
must be further split to achieve a better stratification and more
efficient sample allocation. The following split of the APU's is proposed.

APU    Refined strata forming split APU's

15     {1527, 1546} and {1531, 1520}
19     {1938, 1946} and {1927}
20     {2038, 2046} and {2027}
21     {2130, 2138} and {2146}

b. When the APU's that are either small in size or have marginal wheat are
merged with adjoining similar APU's, there is no significant increase in
sample size.

c. A more efficient stratification for sample allocation is achieved by
merging and/or splitting APU's; see table VIII.

d. The total sample size for TY sampling seems adequate; however, the strata
sample allocation is far from satisfactory. There is significant
overallocation or underallocation of samples, affecting the sample allocation
for about 50 percent of the APU's.

e. There is inadequate representation in sampling from some states. Colorado,
Nebraska, and Texas show undersampling, whereas Kansas and North Dakota show
oversampling during TY. The undersampling in Colorado is partly due
to noncoverage of one of its refined strata. Lack of full coverage
generally results in a biased estimate.

f. When performed at the refined strata level, the optimum allocation leads
to a saving of approximately one-third of the sample size obtained when
it is performed at the APU level. However, the former may not be
desirable because few or no sample segments are allocated for some refined
strata. Optimum sample allocation performed with the split and merged
stratification is recommended.

g. Use of apportioned historical data versus the aggregated county historical 
data (which are more accurate figures for the refined strata and APU's) 
leads to a smaller sample size by about 13 to 15 percent. This suggests 
that apportionment based on agriculture density tends to mask the under- 
lying variability, and therefore its averaging effect leads to under- 
allocation of sample segments for the wheat production estimation. 

h. A similar averaging effect takes place when CAMS segment estimates are
used in estimating the strata variances and then assessing the optimum
sample size. This approach (i.e., use of CAMS segment estimates for
strata variance determination) may lead to undersampling by as much as
20 percent.

It is apparent that natural stratification is the first necessary step
toward developing an efficient sample design for crop assessment of a large
area. Natural stratification should be modified and updated to be applicable
to specific crop types for an optimum sample design. Further, apportionment
should not be based purely on agricultural density. Use of the historical
data in estimation of the strata crop acreages can be avoided by developing
a stratification which is efficient yet does not contain strata too small,
either in total size or in crop size.